@dhardy dhardy commented Dec 16, 2025

  • Added a CHANGELOG.md entry

Summary

Move:

  • traits Word, Sealed into new mod word
  • mod le to utils
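For readers unfamiliar with the pattern, a sealed trait in its own module can be sketched roughly as follows. This is only an illustration under assumptions: the names `Word` and `Sealed` come from the summary above, but the nested `private` module, the `SIZE` constant, the `to_le_vec` method and the `words_to_le_bytes` helper are all hypothetical and are not taken from this PR.

```rust
// Hypothetical sketch of a sealed `Word` trait; not the code from this PR.
mod word {
    mod private {
        // `Sealed` is public but lives in a private module, so code outside
        // `word` cannot name it and therefore cannot implement `Word`.
        pub trait Sealed {}
        impl Sealed for u32 {}
        impl Sealed for u64 {}
    }

    /// Usable by callers, implementable only inside this module.
    /// `SIZE` and `to_le_vec` are assumed members for illustration.
    pub trait Word: private::Sealed + Copy {
        /// Width of the word in bytes.
        const SIZE: usize;
        /// Little-endian byte representation of the word.
        fn to_le_vec(self) -> Vec<u8>;
    }

    impl Word for u32 {
        const SIZE: usize = 4;
        fn to_le_vec(self) -> Vec<u8> {
            self.to_le_bytes().to_vec()
        }
    }

    impl Word for u64 {
        const SIZE: usize = 8;
        fn to_le_vec(self) -> Vec<u8> {
            self.to_le_bytes().to_vec()
        }
    }
}

use word::Word;

/// Hypothetical helper: serialise a slice of words as little-endian bytes.
fn words_to_le_bytes<W: Word>(words: &[W]) -> Vec<u8> {
    let mut out = Vec::with_capacity(words.len() * W::SIZE);
    for &w in words {
        out.extend(w.to_le_vec());
    }
    out
}
```

Sealing keeps the set of implementors closed, so generic helpers can rely on every `Word` being a plain integer type.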

Replace le fns:

Relax the restrictions on <G as Generator>::Output in the Debug impl for BlockRng (reverting the restriction added in #36), since that restriction complicates ReseedingRng's Debug impl and we don't really need to print the index.
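The idea of a Debug impl that places no bound on the output type can be sketched as below. The `Generator` trait, the `BlockRng` field layout, and the `Counter` and `Opaque` types are simplified stand-ins invented for this illustration; the real definitions in the crate may differ.

```rust
use std::fmt;

// Hypothetical stand-in for the real trait; only the shape matters here.
trait Generator {
    type Output;
    fn generate(&mut self, out: &mut Self::Output);
}

// Assumed field layout, for illustration only.
struct BlockRng<G: Generator> {
    core: G,
    results: G::Output,
    index: usize,
}

// Only `G: Debug` is required. `G::Output` needs no bound because neither
// the results buffer nor the index is printed.
impl<G: Generator + fmt::Debug> fmt::Debug for BlockRng<G> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("BlockRng")
            .field("core", &self.core)
            .finish_non_exhaustive()
    }
}

// An output type that deliberately does not implement Debug.
struct Opaque([u32; 4]);

#[derive(Debug)]
struct Counter(u32);

impl Generator for Counter {
    type Output = Opaque;
    fn generate(&mut self, out: &mut Opaque) {
        // Fill the block with consecutive counter values.
        for w in out.0.iter_mut() {
            self.0 = self.0.wrapping_add(1);
            *w = self.0;
        }
    }
}
```

With this shape, a wrapper type that contains a `BlockRng<G>` can derive or implement Debug without having to repeat a bound on `G::Output`.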

Details

This implements many of the code changes to the le module from #24, with some differences (see above).

Documentation is expanded slightly (one example) but omits most of the extras. Perhaps the extra doc examples should be turned into tests?

Benches

Relative to the current master branch, benchmarks of rust-random/rand#1690 show around 8% faster random_bytes/thread (due to inlining?) and 8% slower random_u64/thread (surprising since the relevant code, BlockRng::next_u64_from_u32, should be unaffected).
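For context, producing a u64 from a buffer of u32 words is typically a matter of reading two consecutive words and joining them. The sketch below shows that general technique with the low word first; it is an assumption-based illustration, not the body of BlockRng::next_u64_from_u32, and the function names are hypothetical.

```rust
/// Join two u32 words into one u64, with `lo` in the low 32 bits.
fn u64_from_u32_pair(lo: u32, hi: u32) -> u64 {
    (u64::from(hi) << 32) | u64::from(lo)
}

/// Hypothetical helper: read a u64 from `buf` starting at `*index` and
/// advance the index by two words. Returns `None` (leaving the index
/// unchanged) when fewer than two words remain; a real block RNG would
/// refill its buffer at that point instead.
fn read_u64_sketch(buf: &[u32], index: &mut usize) -> Option<u64> {
    let end = index.checked_add(2)?;
    let pair = buf.get(*index..end)?;
    let value = u64_from_u32_pair(pair[0], pair[1]);
    *index = end;
    Some(value)
}
```

Code of this shape does not touch any byte-level conversion helpers, which is consistent with the expectation that the u64 path should not be affected by the change.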

@dhardy dhardy requested a review from newpavlov December 16, 2025 14:26
@dhardy dhardy changed the title from "Replace mod le with `utils" to "Replace mod le with utils" Dec 16, 2025
dhardy commented Dec 16, 2025

s$ taskset -c 4 cargo bench --bench generators thread -- --baseline rand_core_master
    Finished `bench` profile [optimized] target(s) in 0.04s
     Running benches/generators.rs (/home/dhardy-extra/.cache/cargo-build/release/deps/generators-f99e7648f7fd7691)
random_bytes/thread     time:   [306.03 ns 306.05 ns 306.08 ns]
                        thrpt:  [3.1158 GiB/s 3.1161 GiB/s 3.1163 GiB/s]
                 change:
                        time:   [-7.8181% -7.7017% -7.5409%] (p = 0.00 < 0.05)
                        thrpt:  [+8.1560% +8.3444% +8.4811%]
                        Performance has improved.
Found 28 outliers among 100 measurements (28.00%)
  12 (12.00%) low severe
  2 (2.00%) low mild
  5 (5.00%) high mild
  9 (9.00%) high severe

random_u32/thread       time:   [1.2708 ns 1.2711 ns 1.2714 ns]
                        thrpt:  [2.9302 GiB/s 2.9308 GiB/s 2.9314 GiB/s]
                 change:
                        time:   [-0.3913% -0.3684% -0.3449%] (p = 0.00 < 0.05)
                        thrpt:  [+0.3461% +0.3698% +0.3928%]
                        Change within noise threshold.
Found 173 outliers among 1000 measurements (17.30%)
  98 (9.80%) low severe
  21 (2.10%) low mild
  21 (2.10%) high mild
  33 (3.30%) high severe

random_u64/thread       time:   [2.1697 ns 2.1699 ns 2.1701 ns]
                        thrpt:  [3.4333 GiB/s 3.4336 GiB/s 3.4340 GiB/s]
                 change:
                        time:   [+8.4404% +8.4715% +8.5023%] (p = 0.00 < 0.05)
                        thrpt:  [-7.8361% -7.8099% -7.7834%]
                        Performance has regressed.
Found 173 outliers among 1000 measurements (17.30%)
  163 (16.30%) low mild
  3 (0.30%) high mild
  7 (0.70%) high severe

dhardy commented Dec 17, 2025

I continue to find microbenchmarks consistently inconsistent. Compared to the rand_core master branch:

random_bytes/chacha8    time:   [233.21 ns 233.43 ns 233.67 ns]
                        thrpt:  [4.0813 GiB/s 4.0855 GiB/s 4.0894 GiB/s]
                 change:
                        time:   [-7.6660% -7.5524% -7.4426%] (p = 0.00 < 0.05)
                        thrpt:  [+8.0411% +8.1694% +8.3025%]
                        Performance has improved.
random_bytes/chacha12   time:   [314.50 ns 314.84 ns 315.19 ns]
                        thrpt:  [3.0257 GiB/s 3.0291 GiB/s 3.0324 GiB/s]
                 change:
                        time:   [-2.0189% -1.8551% -1.6651%] (p = 0.00 < 0.05)
                        thrpt:  [+1.6933% +1.8902% +2.0605%]
                        Performance has improved.
random_bytes/chacha20   time:   [439.23 ns 439.65 ns 440.11 ns]
                        thrpt:  [2.1669 GiB/s 2.1691 GiB/s 2.1713 GiB/s]
                 change:
                        time:   [-0.5591% -0.3701% -0.1649%] (p = 0.00 < 0.05)
                        thrpt:  [+0.1652% +0.3715% +0.5623%]
                        Change within noise threshold.
random_u64/chacha8      time:   [1.6599 ns 1.6606 ns 1.6613 ns]
                        thrpt:  [4.4847 GiB/s 4.4866 GiB/s 4.4886 GiB/s]
                 change:
                        time:   [+12.571% +12.628% +12.684%] (p = 0.00 < 0.05)
                        thrpt:  [-11.256% -11.212% -11.167%]
                        Performance has regressed.
random_u64/chacha12     time:   [2.1342 ns 2.1351 ns 2.1361 ns]
                        thrpt:  [3.4879 GiB/s 3.4895 GiB/s 3.4911 GiB/s]
                 change:
                        time:   [+7.0265% +7.0894% +7.1515%] (p = 0.00 < 0.05)
                        thrpt:  [-6.6742% -6.6201% -6.5652%]
                        Performance has regressed.
Found 1 outliers among 1000 measurements (0.10%)
  1 (0.10%) high mild
random_u64/chacha20     time:   [3.2887 ns 3.2901 ns 3.2914 ns]
                        thrpt:  [2.2636 GiB/s 2.2645 GiB/s 2.2655 GiB/s]
                 change:
                        time:   [+7.1823% +7.2344% +7.2841%] (p = 0.00 < 0.05)
                        thrpt:  [-6.7895% -6.7464% -6.7010%]
                        Performance has regressed.
Found 1 outliers among 1000 measurements (0.10%)
  1 (0.10%) high severe

Also, random_u32/pcg32 and random_u64/pcg64 are 5% and 6% slower respectively (they should be entirely unaffected), so maybe the CPU frequency is insufficiently stable.
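When reading the reports above, note that the time and throughput changes are two views of the same measurement: for a fixed amount of work, throughput is inversely proportional to time. A small sketch of that arithmetic (the function name is hypothetical):

```rust
/// Convert a relative change in time into the matching relative change in
/// throughput, assuming a fixed amount of work per iteration.
/// Both values are fractions: 0.05 means +5%.
fn throughput_change(time_change: f64) -> f64 {
    1.0 / (1.0 + time_change) - 1.0
}
```

For example, the +8.4715% time change reported for random_u64/thread maps to about -7.81% throughput, and the -7.7017% time change for random_bytes/thread maps to about +8.34%, matching the thrpt lines in the output.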

@dhardy dhardy merged commit 29b1630 into master Dec 17, 2025
13 checks passed
@dhardy dhardy deleted the utils branch December 17, 2025 12:33